RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning

Authors

  • Yan Duan
  • John Schulman
  • Xi Chen
  • Peter L. Bartlett
  • Ilya Sutskever
  • Pieter Abbeel
Abstract

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we propose to represent it as a recurrent neural network (RNN) and learn it from data. In our proposed method, RL$^2$, the algorithm is encoded in the weights of the RNN, which are learned slowly through a general-purpose (“slow”) RL algorithm. The RNN receives all information a typical RL algorithm would receive, including observations, actions, rewards, and termination flags; and it retains its state across episodes in a given Markov Decision Process (MDP). The activations of the RNN store the state of the “fast” RL algorithm on the current (previously unseen) MDP. We evaluate RL$^2$ experimentally on both small-scale and large-scale problems. On the small-scale side, we train it to solve randomly generated multi-armed bandit problems and finite MDPs. After RL$^2$ is trained, its performance on new MDPs is close to human-designed algorithms with optimality guarantees. On the large-scale side, we test RL$^2$ on a vision-based navigation task and show that it scales up to high-dimensional problems.
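
The interaction loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the class `RecurrentPolicy`, the function `run_trial`, the plain tanh recurrent cell, the layer sizes, and the Bernoulli bandit are all hypothetical choices made for this example, and the weights are random rather than trained by a "slow" RL algorithm. What it is meant to show is the input layout (observation, previous action, previous reward, termination flag) and the fact that the hidden state is reset once per MDP and carried across episode boundaries.

```python
import numpy as np


class RecurrentPolicy:
    """Tiny tanh RNN policy with random, untrained weights (illustration only)."""

    def __init__(self, obs_dim, n_actions, hidden_dim=16, seed=0):
        rng = np.random.default_rng(seed)
        # Input = observation + one-hot previous action + previous reward + done flag.
        in_dim = obs_dim + n_actions + 2
        self.n_actions = n_actions
        self.hidden_dim = hidden_dim
        self.W_in = rng.normal(0.0, 0.1, (hidden_dim, in_dim))
        self.W_h = rng.normal(0.0, 0.1, (hidden_dim, hidden_dim))
        self.W_out = rng.normal(0.0, 0.1, (n_actions, hidden_dim))

    def initial_state(self):
        return np.zeros(self.hidden_dim)

    def step(self, h, obs, prev_action, prev_reward, done):
        a_onehot = np.zeros(self.n_actions)
        if prev_action is not None:
            a_onehot[prev_action] = 1.0
        x = np.concatenate([obs, a_onehot, [prev_reward, float(done)]])
        h = np.tanh(self.W_in @ x + self.W_h @ h)  # the "fast" learner's state
        logits = self.W_out @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        return h, probs


def run_trial(policy, arm_means, n_episodes=2, episode_len=5, seed=0):
    """Run several episodes on ONE bandit problem without resetting the hidden state."""
    rng = np.random.default_rng(seed)
    h = policy.initial_state()  # reset once per MDP, not once per episode
    obs = np.zeros(1)           # assumed dummy observation for a bandit
    prev_action, prev_reward, done = None, 0.0, False
    total_reward = 0.0
    for _ in range(n_episodes):
        for t in range(episode_len):
            h, probs = policy.step(h, obs, prev_action, prev_reward, done)
            action = int(rng.choice(len(arm_means), p=probs))
            reward = float(rng.random() < arm_means[action])  # Bernoulli arm
            done = t == episode_len - 1  # flag is fed to the RNN on the next step
            prev_action, prev_reward = action, reward
            total_reward += reward
    return total_reward, h


policy = RecurrentPolicy(obs_dim=1, n_actions=3)
total_reward, final_state = run_trial(policy, arm_means=[0.1, 0.5, 0.9])
```

In a full training setup, an outer policy-gradient method would adjust `W_in`, `W_h`, and `W_out` over many sampled problems so that the hidden-state updates come to implement an effective exploration strategy; that outer loop is omitted here.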

Similar resources

RL$^2$: Fast Reinforcement Learning via Slow Reinforcement Learning

Deep reinforcement learning (deep RL) has been successful in learning sophisticated behaviors automatically; however, the learning process requires a huge number of trials. In contrast, animals can learn new tasks in just a few trials, benefiting from their prior knowledge about the world. This paper seeks to bridge this gap. Rather than designing a “fast” reinforcement learning algorithm, we p...

Reinforcement Learning in Neural Networks: A Survey

In recent years, research on reinforcement learning (RL) has focused on bridging the gap between adaptive optimal control and bio-inspired learning techniques. Neural network reinforcement learning (NNRL) is among the most popular algorithms in the RL framework. Using neural networks enables RL to search for optimal policies more efficiently in several real-life applicat...

Issues in Putting Reinforcement Learning onto Robots

There has recently been a good deal of interest in robot learning. Reinforcement Learning (RL) is a trial and error approach to learning that has recently become popular with roboticists. This is despite the fact that RL methods are very slow, and scale badly with the size of the state and action spaces, thus making them difficult to put onto real robots. This paper describes some work I have bee...

Searching for Plannable Domains can Speed up Reinforcement Learning

Reinforcement learning (RL) involves sequential decision making in uncertain environments. The aim of the decision-making agent is to maximize the benefit of acting in its environment over an extended period of time. Finding an optimal policy in RL may be very slow. To speed up learning, one often used solution is the integration of planning, for example, Sutton’s Dyna algorithm, or various oth...

Publication date: 2016